AutoCyclone: Automatic Mining of Cyclic Online Activities with Robust Tensor Factorization
Abstract
Given a collection of seasonal time series, how can we find regular (cyclic) patterns and outliers (i.e., rare events)? These two types of patterns are hidden and mixed in the time-varying activities. How can we robustly separate regular patterns from outliers without requiring any prior information? We present CycloneM, a unifying model that captures both cyclic patterns and outliers, and CycloneFact, a novel algorithm that solves the above problem. We also present an automatic mining framework, AutoCyclone, based on CycloneM and CycloneFact. Our method has the following properties: (a) effective: it captures important cyclic features such as trend and seasonality, and clearly distinguishes regular patterns from rare events; (b) robust and accurate: it detects the above features and patterns accurately in the presence of outliers; (c) fast: CycloneFact takes time linear in the data size and typically converges in a few iterations; (d) parameter-free: our modeling framework frees the user from having to provide parameter values. Extensive experiments on four real datasets demonstrate the benefits of the proposed model and algorithm: the model captures latent cyclic patterns, trends, and rare events, and the algorithm outperforms existing state-of-the-art approaches. CycloneFact was up to 5 times more accurate and 20 times faster than top competitors.
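The idea of separating a regular cyclic component from sparse rare events can be illustrated with a small sketch. This is not CycloneM or CycloneFact, whose details are not given in the abstract. It is a generic low-rank-plus-sparse decomposition, assuming a single series with a known period that is folded into a (cycles x period) matrix. The function names, the rank, the threshold `lam`, and the iteration count are all hypothetical choices made for illustration, and unlike the method described above this sketch is not parameter-free.

```python
import numpy as np


def fold(series, period):
    """Reshape a 1-D series into a (n_cycles, period) matrix, dropping any remainder."""
    series = np.asarray(series, dtype=float)
    n_cycles = len(series) // period
    return series[: n_cycles * period].reshape(n_cycles, period)


def soft_threshold(x, lam):
    """Shrink values toward zero by lam; entries smaller than lam become exactly zero."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def separate_cyclic_and_outliers(series, period, rank=1, lam=1.0, n_iter=50):
    """Alternate between a truncated-SVD fit (cyclic part) and a sparse residual (outliers).

    Illustrative stand-in only: the rank, lam, and n_iter values are assumptions,
    not settings taken from the paper.
    """
    X = fold(series, period)
    S = np.zeros_like(X)
    for _ in range(n_iter):
        # Low-rank step: best rank-`rank` approximation of the data with outliers removed.
        U, s, Vt = np.linalg.svd(X - S, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Sparse step: keep only residuals that are too large to be regular behaviour.
        S = soft_threshold(X - L, lam)
    return L, S
```

With a rank of 1, each row of `L` is the same within-cycle shape scaled by a per-cycle factor, which loosely mirrors a seasonal pattern whose strength follows a trend. The nonzero entries of `S` mark candidate rare events.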
Similar Resources
Inert Module Extensions, Multiplicatively Closed Subsets Conserving Cyclic Submodules and Factorization in Modules
Consider a commutative ring with identity, a unitary module over that ring, and a multiplicatively closed subset of the ring. Factorization theory in commutative rings, which has a long history, still attracts the attention of many researchers. Although the theory at first focused on factorization properties of elements in integral domains, in the late nineties the theory was gener...
Fast Local Algorithms for Large Scale Nonnegative Matrix and Tensor Factorizations
Nonnegative matrix factorization (NMF) and its extensions such as Nonnegative Tensor Factorization (NTF) have become prominent techniques for blind source separation (BSS), analysis of image databases, data mining and other information retrieval and clustering applications. In this paper we propose a family of efficient algorithms for NMF/NTF, as well as sparse nonnegative coding and represent...
WTEN: An Advanced Coupled Tensor Factorization Strategy for Learning from Imbalanced Data
Learning from imbalanced and sparse data in multi-mode and high-dimensional tensor formats efficiently is a significant problem in data mining research. On one hand, Coupled Tensor Factorization (CTF) has become one of the most popular methods for joint analysis of heterogeneous sparse data generated from different sources. On the other hand, techniques such as sampling, cost-sensitive learning...
Discovering Hidden Structure in High Dimensional Human Behavioral Data via Tensor Factorization
In recent years, the rapid growth in technology has increased the opportunity for longitudinal human behavioral studies. Rich multimodal data from wearables like Fitbit, online social networks, mobile phones, etc. can be collected in natural environments. Uncovering the underlying low-dimensional structure of noisy multi-way data in an unsupervised setting is a challenging problem. Tensor facto...
Robust Iris Recognition in Unconstrained Environments
A biometric system provides automatic identification of an individual based on a unique feature or characteristic possessed by him/her. Iris recognition (IR) is known to be the most reliable and accurate biometric identification system. The iris recognition system (IRS) consists of an automatic segmentation mechanism which is based on the Hough transform (HT). This paper presents a robust IRS i...